Supervised outlier detection for classification and regression

Abstract

Outlier detection, i.e., the task of detecting points that are markedly different from the data sample, is an important challenge in machine learning. When a model is built, these special points can skew the training and result in less accurate predictions. Due to this fact, it is important to identify and remove them before building any supervised model, and this is often the first step when dealing with a machine learning problem. Nowadays, there exists a very large number of outlier detector algorithms that provide good results, but their main drawbacks are their unsupervised nature together with the hyperparameters that must be properly set for obtaining good performance. In this work, a new supervised outlier estimator is proposed. This is done by pipelining an outlier detector with a following supervised model, in such a way that the targets of the latter supervise how all the hyperparameters involved in the outlier detector are optimally selected. This pipeline-based approach makes it easy to combine different outlier detectors with different classifiers and regressors. In the experiments done, nine relevant outlier detectors have been combined with three regressors over eight regression problems, as well as with two classifiers over another set of binary and multi-class classification problems. The usefulness of the proposal as an objective and automatic way to optimally determine detector hyperparameters has been proven, and the effectiveness of the nine outlier detectors has also been analyzed and compared.
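The pipelining idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the median/MAD detector, the one-feature least-squares regressor, the synthetic data, the threshold grid, and every function name (`inlier_mask`, `fit_line`, `cv_error`) are assumptions chosen only for illustration. What it shows is the selection mechanism: the detector's single hyperparameter is picked by the cross-validated error of the supervised model that follows it.

```python
import random
import statistics

def inlier_mask(values, threshold):
    """Toy unsupervised detector (assumed): True where the robust z-score is <= threshold."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values]) or 1e-12
    return [0.6745 * abs(v - med) / mad <= threshold for v in values]

def fit_line(xs, ys):
    """Ordinary least squares for y = a * x + b with a single feature."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return a, my - a * mx

def cv_error(xs, ys, threshold, n_folds=5):
    """Mean absolute validation error of 'detector, then regressor' for one threshold."""
    errors = []
    for k in range(n_folds):
        train = [i for i in range(len(xs)) if i % n_folds != k]
        valid = [i for i in range(len(xs)) if i % n_folds == k]
        # The detector only filters the training fold; the validation fold is left untouched.
        keep = inlier_mask([ys[i] for i in train], threshold)
        kept = [i for i, ok in zip(train, keep) if ok]
        a, b = fit_line([xs[i] for i in kept], [ys[i] for i in kept])
        errors += [abs(ys[i] - (a * xs[i] + b)) for i in valid]
    return sum(errors) / len(errors)

# Synthetic regression data (assumed): y = 2x + 1 plus noise, with a few gross outliers.
rng = random.Random(0)
xs = [rng.uniform(0.0, 10.0) for _ in range(100)]
ys = [2.0 * x + 1.0 + rng.gauss(0.0, 0.5) for x in xs]
for i in range(0, 100, 11):
    ys[i] += 80.0

# Hypothetical grid; float("inf") means "no outlier removal at all".
grid = [1.0, 2.0, 3.5, 5.0, float("inf")]
scores = {t: cv_error(xs, ys, t) for t in grid}
best = min(grid, key=scores.get)
print("selected threshold:", best)
```

Including `float("inf")` in the grid lets the search fall back to no filtering when removing points does not help the downstream model, which is one way to keep the comparison objective.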

Similar resources

Outlier Detection by Boosting Regression Trees

A procedure for detecting outliers in regression problems is proposed. It is based on information provided by boosting regression trees. The key idea is to select the most frequently resampled observation along the boosting iterations and reiterate after removing it. The selection criterion is based on Tchebychev’s inequality applied to the maximum over the boosting iterations of ...

Granular Box Regression Methods for Outlier Detection

Granular computing (GrC) is an emerging computing paradigm of information processing. It concerns the processing of complex information entities called information granules, which arise in the process of data abstraction and derivation of knowledge from information. Granular computing is more of a theoretical perspective; it encourages an approach to data that recognizes and exploits the knowledge...

Efficient Classification Technique for Outlier Detection

Outliers are data objects that clearly differ in their behavior from the normal data. Outlier detection mainly aims at finding these data objects. It has become a major area of research in data mining, where it plays a crucial role. Most of the methods used for outlier detection consider the positive data and their behavior, and then the data violating the behav...

Classification Based Outlier Detection Techniques

Outlier detection is an important research area forming part of many application domains. Specific application domains call for specific detection techniques, while the more generic ones can be applied in a large number of scenarios with good results. This survey tries to provide a structured and comprehensive overview of the research on Classification Based Outlier Detection listing out variou...

Outlier Diagnostics in Logistic Regression: A Supervised Learning Technique

The goal of supervised learning is to build a concise model of the distribution of class labels in terms of predictor features. Logistic regression is one of the most popular supervised learning techniques used in classification. Fields like computer vision, image analysis and engineering sciences frequently encounter data with outliers (noise). Presence of outliers in the training sampl...

Journal

Journal title: Neurocomputing

Year: 2022

ISSN: 0925-2312, 1872-8286

DOI: https://doi.org/10.1016/j.neucom.2022.02.047